Smoothed marginal distribution constraints for language modeling
Authors
Abstract
We present an algorithm for re-estimating the parameters of backoff n-gram language models so as to preserve given marginal distributions, along the lines of the well-known Kneser-Ney (1995) smoothing. Unlike Kneser-Ney, our approach is designed to be applied to any given smoothed backoff model, including models that have already been heavily pruned. As a result, the algorithm avoids issues observed when pruning Kneser-Ney models (Siivola et al., 2007; Chelba et al., 2010), while retaining the benefits of such marginal distribution constraints. We present experimental results for heavily pruned backoff n-gram models, and demonstrate perplexity and word error rate reductions when the method is used with various baseline smoothing methods. An open-source version of the algorithm has been released as part of the OpenGrm NGram library.
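To make the underlying idea concrete: in the bigram case, requiring that the model's implied unigram marginal match the data yields the familiar Kneser-Ney continuation-count estimate for the lower-order distribution. The sketch below is a minimal Python illustration of that special case only; it is not the paper's re-estimation algorithm (which operates on an already smoothed, possibly pruned backoff model), and the function name and toy counts are hypothetical.

```python
from collections import defaultdict

def continuation_unigrams(bigram_counts):
    """Kneser-Ney-style lower-order estimate: p(w) is proportional to the
    number of distinct histories that w follows, not to its raw frequency.
    This is the closed-form consequence of imposing a unigram marginal
    constraint on a bigram backoff model."""
    histories = defaultdict(set)  # word -> set of observed histories
    for (h, w), c in bigram_counts.items():
        if c > 0:
            histories[w].add(h)
    total = sum(len(hs) for hs in histories.values())
    return {w: len(hs) / total for w, hs in histories.items()}

# Toy counts: "francisco" is frequent but follows only one history ("san"),
# so its continuation probability stays small.
counts = {("san", "francisco"): 5, ("new", "york"): 3,
          ("in", "york"): 1, ("in", "san"): 2}
print(continuation_unigrams(counts))  # {'francisco': 0.25, 'york': 0.5, 'san': 0.25}
```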
Similar references
A Berry-Esseen Type Bound for a Smoothed Version of Grenander Estimator
In various statistical models, such as density estimation and estimation of regression curves or hazard rates, monotonicity constraints can arise naturally. A frequently encountered problem in nonparametric statistics is to estimate a monotone density function f on a compact interval. A well-known estimator for the density f under the restriction that f is decreasing is the Grenander estimator, ...
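As background for this excerpt (standard textbook material rather than something stated in the truncated abstract), the Grenander estimator admits a one-line definition:

```latex
% Grenander estimator of a decreasing density f, from a sample with
% empirical distribution function F_n:
\hat{f}_n(x) \;=\; \text{left derivative at } x \text{ of } \widehat{F}_n,
\qquad
\widehat{F}_n \;=\; \text{least concave majorant of } F_n .
```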
Determination of Maximum Bayesian Entropy Probability Distribution
In this paper, we consider methods for determining maximum entropy multivariate distributions with a given prior, under the constraints that the marginal distributions, or the marginals and the covariance matrix, are prescribed. Next, numerical solutions are considered for cases where a closed-form solution is unavailable. Finally, these methods are illustrated via some numerical examples.
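One standard special case may help make these constraints concrete (textbook background rather than a result of the paper): with a uniform reference prior and only the two marginals prescribed, the entropy-maximizing joint distribution is simply the product of the prescribed marginals.

```latex
% Maximize H(p) subject only to p_X = f_X and p_Y = f_Y:
H(p) \;\le\; H(p_X) + H(p_Y) \;=\; H(f_X) + H(f_Y),
\qquad \text{with equality iff } p(x,y) = f_X(x)\,f_Y(y),
% so the independence coupling attains the maximum.
```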
Adaptive Language Modeling Using Minimum Discriminant Estimation
We present an algorithm to adapt an n-gram language model to a document as it is dictated. The observed partial document is used to estimate a unigram distribution for the words that have already occurred. Then, we find the n-gram distribution closest to the static n-gram distribution (using the discrimination information distance measure) that satisfies the marginal constraints derived from the ...
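The closed form of such a minimum discrimination information (I-projection) step is standard and may clarify the construction; the notation below (p_static, alpha, Z) is mine rather than the paper's, and unigram marginal constraints are assumed.

```latex
% Closest model to p_static (in discrimination information) whose unigram
% marginal under the history distribution \tilde{p}(h) equals q(w):
p_{\mathrm{adapt}}(w \mid h)
  \;=\; \frac{p_{\mathrm{static}}(w \mid h)\,\alpha(w)}{Z(h)},
\qquad
Z(h) \;=\; \sum_{w'} p_{\mathrm{static}}(w' \mid h)\,\alpha(w'),
% where \alpha(w) = e^{\lambda_w} is chosen so that
% \sum_h \tilde{p}(h)\, p_{\mathrm{adapt}}(w \mid h) = q(w).
```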
Developing Non-linear Dynamic Model to Estimate Value at Risk, Considering the Effects of Asymmetric News: Evidence from Tehran Stock Exchange
Empirical studies show that there is a stronger dependency between large losses than between large profits in financial markets, which undermines the performance of symmetric distributions for modeling this asymmetry. For this reason, assuming a normal joint distribution of returns is not suitable, since it captures only linear dependence and can lead to inappropriate estimates of VaR. Copula theo...
The Smoothed Dirichlet Distribution: Understanding Cross-entropy Ranking in Information Retrieval
Ph.D. dissertation by Ramesh M. Nallapati, University of Massachusetts Amherst, September 2006; directed by Prof. James Allan. Unigram language modeling is a successful probabilistic fr...
Publication date: 2013